18 research outputs found

    Prediction of future hospital admissions - what is the tradeoff between specificity and accuracy?

    Large amounts of electronic medical records collected by hospitals across the developed world offer unprecedented possibilities for knowledge discovery using computer-based data mining and machine learning. Notwithstanding significant research efforts, the use of this data in the prediction of disease development has largely been disappointing. In this paper we examine in detail a recently proposed method which has, in preliminary experiments, demonstrated highly promising results on real-world data. We scrutinize the authors' claims that the proposed model is scalable and investigate whether the tradeoff between prediction specificity (i.e. the ability of the model to predict a large number of different ailments) and accuracy (i.e. the ability of the model to make the correct prediction) is practically viable. Our experiments, conducted on a data corpus of nearly 3,000,000 admissions, support the authors' expectations and demonstrate that the high prediction accuracy is maintained well even when the number of admission types explicitly included in the model is increased to account for 98% of all admissions in the corpus. Thus several promising directions for future work are highlighted.
    Comment: In Proc. International Conference on Bioinformatics and Computational Biology, April 201

    Automatic knowledge extraction from EHRs

    Increasing efforts in the collection, standardization, and maintenance of large-scale longitudinal electronic health care records (EHRs) across the world provide a promising source of real-world medical data with the potential of providing major novel insights of benefit both to specific individuals in the context of personalized medicine, as well as on the level of population-wide health care and policy. The present paper builds upon the existing and intensifying efforts at using machine learning to provide predictions on future diagnoses likely to be experienced by a particular individual based on the person’s existing diagnostic history. The specific model adopted as the baseline predictive framework is based on the concept of a binary diagnostic history vector representation of a patient’s diagnostic medical record. The technical novelty introduced herein concerns the manner in which transitions between diagnostic history vectors are learnt. We demonstrate that the proposed change prima facie enables greater learning specificity. We present a series of experiments which demonstrate the effectiveness of the proposed techniques, and which reveal novel insights regarding the most promising future research directions.

    Towards sophisticated learning from EHRs : increasing prediction specificity and accuracy using clinically meaningful risk criteria

    Computer-based analysis of Electronic Health Records (EHRs) has the potential to provide major novel insights of benefit both to specific individuals in the context of personalized medicine, as well as on the level of population-wide health care and policy. The present paper introduces a novel algorithm that uses machine learning for the discovery of longitudinal patterns in the diagnoses of diseases. Two key technical novelties are introduced: one in the form of a novel learning paradigm which enables greater learning specificity, and another in the form of a risk-driven identification of confounding diagnoses. We present a series of experiments which demonstrate the effectiveness of the proposed techniques, and which reveal novel insights regarding the most promising future research directions.

    Diagnosis prediction from electronic health records (EHR) using the binary diagnosis history vector representation

    Large amounts of rich, heterogeneous information nowadays routinely collected by health care providers across the world possess remarkable potential for the extraction of novel medical data and the assessment of different practices in real-world conditions. Specifically, in this work our goal is to use Electronic Health Records (EHRs) to predict progression patterns of future diagnoses of ailments for a particular patient, given the patient’s present diagnostic history. Following the highly promising results of a recently proposed approach which introduced the diagnosis history vector representation of a patient’s diagnostic record, we introduce a series of improvements to the model and conduct thorough experiments that demonstrate its scalability, accuracy, and practicability in the clinical context. We show that the model is able to capture well the interaction between a large number of ailments which correspond to the most frequent diagnoses, show how the original learning framework can be adapted to increase its prediction specificity, and describe a principled, probabilistic method for incorporating explicit, human clinical knowledge to overcome semantic limitations of the raw EHR data.
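The abstracts above describe a binary diagnosis history vector and a model that learns transitions between such vectors, but they give no implementation details. The following is only a minimal illustrative sketch of that general idea, not the authors' method. It assumes each patient record is a chronological list of diagnosis codes, that a vector bit is set once a tracked code has been seen, and that transitions are estimated by simple frequency counts. All function names (`history_vectors`, `fit_transitions`, `predict_next`) and the toy codes are hypothetical.

```python
from collections import Counter, defaultdict


def history_vectors(admissions, tracked):
    """Yield the cumulative binary history vector after each admission.

    admissions: diagnosis codes in chronological order (assumed input format).
    tracked: ordered list of codes that get their own bit in the vector.
    """
    index = {code: i for i, code in enumerate(tracked)}
    state = [0] * len(tracked)
    for code in admissions:
        if code in index:
            state[index[code]] = 1  # a bit is never cleared once set
        yield tuple(state)


def fit_transitions(patients, tracked):
    """Count observed transitions between consecutive history vectors."""
    counts = defaultdict(Counter)
    for admissions in patients:
        prev = tuple([0] * len(tracked))  # empty history before any admission
        for vec in history_vectors(admissions, tracked):
            if vec != prev:  # assumption: only count changes of state
                counts[prev][vec] += 1
            prev = vec
    return counts


def predict_next(counts, current):
    """Return relative frequencies of next vectors, or None if unseen."""
    nxt = counts.get(current)
    if not nxt:
        return None
    total = sum(nxt.values())
    return {vec: n / total for vec, n in nxt.items()}


# Toy usage with made-up codes "A", "B", "C".
tracked = ["A", "B", "C"]
patients = [["A", "B"], ["A", "C"], ["A", "B"]]
counts = fit_transitions(patients, tracked)
print(predict_next(counts, (1, 0, 0)))
```

A frequency-count estimate is the simplest choice for such a sketch. With many tracked codes the number of possible vectors grows exponentially, which is one reason the tradeoff between specificity and accuracy discussed in these abstracts matters.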

    The problem with probabilistic DAG automata for semantic graphs

    Semantic representations in the form of directed acyclic graphs (DAGs) have been introduced in recent years, and to model them, we need probabilistic models of DAGs. One model that has attracted some attention is the DAG automaton, but it has not been studied as a probabilistic model. We show that some DAG automata cannot be made into useful probabilistic models by the nearly universal strategy of assigning weights to transitions. The problem affects single-rooted, multi-rooted, and unbounded-degree variants of DAG automata, and appears to be pervasive. It does not affect planar variants, but these are problematic for other reasons.
    Comment: To appear in NAACL-HLT 201

    Overview of Neuromuscular Disorder Molecular Diagnostic Experience for the Population of Latvia

    Funding Information: The Article Processing Charge was funded by the authors. Publisher Copyright: © American Academy of Neurology.
    Background and Objectives: Genetic testing has become an integral part of health care, allowing the confirmation of thousands of hereditary diseases, including neuromuscular disorders (NMDs). The reported average prevalence of individual inherited NMDs is 3.7-4.99 per 10,000. This number varies greatly in the selected populations after applying population-wide studies. The aim of this study was to evaluate the effect of genetic analysis as the first-tier test in patients with NMD and to calculate the disease prevalence and allelic frequencies for reoccurring genetic variants. Methods: Patients with NMD from Latvia with molecular tests confirming their diagnosis in 2008-2020 were included in this retrospective study. Results: Diagnosis was confirmed in 153 unique cases of all persons tested. Next-generation sequencing resulted in a detection rate of 37%. Two of the most common childhood-onset NMDs in our population were spinal muscular atrophy and dystrophinopathies, with a birth prevalence of 1.01 per 10,000 newborns and 2.08 per 10,000 (male newborn population), respectively. The calculated point prevalence was 0.079 per 10,000 for facioscapulohumeral muscular dystrophy type 1, 0.078 per 10,000 for limb-girdle muscular dystrophy, 0.073 per 10,000 for nondystrophic congenital myotonia, 0.052 per 10,000 for spinobulbar muscular atrophy, and 0.047 per 10,000 for type 1 myotonic dystrophy. Discussion: DNA diagnostics is a successful approach. The carrier frequencies of the common CAPN3, FKRP, SPG11, and HINT1 gene variants as well as that of the SMN1 gene exon 7 deletion in the population of Latvia are comparable with data from Europe. The carrier frequency of the CLCN1 gene variant c.2680C>T p.(Arg894Ter) is 2.11%, and consequently, congenital myotonia is the most frequent NMD in our population.
